Seal call recognition based on general regression neural network using Mel-frequency cepstrum coefficient features
Abstract
In this paper, a general regression neural network (GRNN) with Mel-frequency cepstrum coefficient (MFCC) input features is employed to automatically recognize the calls of leopard, Ross, and Weddell seals, whose living areas overlap widely. As a feedforward network, the GRNN has only one parameter, i.e., the spread factor. The recognition performance can be greatly improved by determining the spread factor with a cross-validation method. This paper selects audio data of the above three kinds of seal calls and compares machine learning models given MFCC features and the low-frequency analyzer and recorder (LOFAR) spectrum as input. The results show that, at the same signal-to-noise ratio (SNR), the recognition result with MFCC features is better than that with the LOFAR spectrum, which is verified by a statistical histogram. Compared with the other models, the GRNN with MFCC input can still achieve effective recognition at low SNRs. Specifically, the accuracy is 97.36%, 93.44%, 92.00% and 88.38% for the cases of an infinite SNR and SNRs of 10, 5 and 0 dB, respectively. In particular, the GRNN requires the least training and testing time. Therefore, all results show that the proposed method performs excellently for seal call recognition.
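The abstract's central idea, a GRNN whose single spread factor is chosen by cross-validation, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the GRNN is written in its standard Gaussian-kernel form with one-hot class targets, the data are synthetic 13-dimensional vectors standing in for MFCC features, and the 5-fold split, the candidate grid, and the helper names (grnn_predict, cv_accuracy, select_spread) are hypothetical.

```python
import numpy as np

def grnn_predict(X_train, Y_train, X_test, spread):
    """GRNN output: Gaussian-weighted average of the training targets."""
    # Squared Euclidean distance from every test vector to every training vector.
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Shifting each row by its minimum leaves the weighted average unchanged
    # but keeps the exponentials from underflowing to zero for small spreads.
    d2 = d2 - d2.min(axis=1, keepdims=True)
    w = np.exp(-d2 / (2.0 * spread ** 2))
    return (w @ Y_train) / w.sum(axis=1, keepdims=True)

def cv_accuracy(X, y, spread, n_classes, k=5, seed=0):
    """Mean classification accuracy of the GRNN over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    Y = np.eye(n_classes)[y]  # one-hot class targets
    correct = 0
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        scores = grnn_predict(X[train], Y[train], X[test], spread)
        correct += int((scores.argmax(axis=1) == y[test]).sum())
    return correct / len(X)

def select_spread(X, y, candidates, n_classes, k=5):
    """Return the candidate spread with the highest cross-validated accuracy."""
    accs = [cv_accuracy(X, y, s, n_classes, k) for s in candidates]
    return candidates[int(np.argmax(accs))], accs

# Synthetic stand-in for MFCC vectors of three call types (assumption, not real data).
rng = np.random.default_rng(42)
n_classes, dim, per_class = 3, 13, 60
centers = rng.normal(scale=3.0, size=(n_classes, dim))
X = np.vstack([c + rng.normal(size=(per_class, dim)) for c in centers])
y = np.repeat(np.arange(n_classes), per_class)

candidates = [0.1, 0.3, 1.0, 3.0, 10.0]
best_spread, accs = select_spread(X, y, candidates, n_classes)
print("selected spread:", best_spread)
```

Because the spread factor is the only free parameter, the search reduces to a one-dimensional grid, and "training" amounts to storing the training vectors; this is consistent with the short training time the abstract reports, although the sketch makes no claim about the paper's actual timings.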
Similar resources
Artificial Neural Network & Mel-Frequency Cepstrum Coefficients-Based Speaker Recognition
Speaker recognition is the process of automatically recognizing who is speaking on the basis of individual information included in speech waves. This technique makes it possible to use the speaker’s voice to verify their identity and control access to services such as voice dialing, banking by telephone, telephone shopping, database access services, information services, voice mail, security co...
Modified Mel-frequency Cepstrum Coefficient
This paper describes the principle of MFCC feature extraction and the knowledge of human auditory masking effect in order to introduce a modified-MFCC feature extraction that can improve the robustness of speech recognition systems.
Text Independent Automatic Speaker Recognition System Using Mel-Frequency Cepstrum Coefficient and Gaussian Mixture Models
The aim of this paper is to show the accuracy and time results of a text independent automatic speaker recognition (ASR) system, based on Mel-Frequency Cepstrum Coefficients (MFCC) and Gaussian Mixture Models (GMM), in order to develop a security control access gate. 450 speakers were randomly extracted from the Voxforge.org audio database, their utterances have been improved using spectral sub...
Speaker recognition model using two-dimensional mel-cepstrum and predictive neural network
This paper describes a speaker recognition model using Two-Dimensional Mel-Cepstrum (TDMC) and a predictive neural network. The speaker model consists of two networks. The first one is a self-organizing VQ map (Kohonen's feature map). The second part is the predictive network, which learns transitional patterns on the feature map of each speaker's model. TDMC consists of averaged features and dynamic features ...
Robust Speech Recognition Using Perceptual Wavelet Denoising and Mel-frequency Product Spectrum Cepstral Coefficient Features
To improve the performance of Automatic Speech Recognition (ASR) Systems, a new method is proposed to extract features capable of operating at a very low signal-to-noise ratio (SNR). The basic idea introduced in this article is to enhance speech quality as the first stage for Mel-cepstra based recognition systems, since it is well-known that cepstral coefficients provided better performance in ...
Journal
Journal title: EURASIP Journal on Advances in Signal Processing
Year: 2023
ISSN: 1687-6180, 1687-6172
DOI: https://doi.org/10.1186/s13634-023-01014-1